The Molecular Biology of HIV-1

Jean-Pierre Vartanian and Simon Wain-Hobson 



Unité de Rétrovirologie Moléculaire, Institut Pasteur,
75724 Paris cedex 15, France.



No group of RNA viruses has as diverse a genetic organization or as plastic a
genome as the retroviruses. Their ability to transduce, or steal, genetic material
from the host cell genome (a remarkable feat of genetic engineering) gave rise to
the oncogene, a discovery that has revolutionized cancer research. With the
identification of HIV as the etiologic agent of AIDS, retrovirology entered its
renaissance. Many retroviruses may still be discovered, just as HIV emerged
unexpectedly and dramatically. The recent description of a novel retrovirus
associated with solid tumors in fish is just such an example (1).

Retroviruses have been conventionally divided into three subfamilies based on
biologic and morphologic criteria: oncoviruses, spumaviruses, and lentiviruses. This
classification has since become outdated, as retroviruses have been found to be more
complex and varied than first thought.

The oncoviruses are tumor-promoting viruses that can be divided into three or
four distinct groups, not all of which induce malignancies. For example, the
oncogenic potential of the Mason-Pfizer monkey virus (MPMV), a D-type retrovirus,
was investigated without success (2). Yet the immunosuppressive effect of the virus
in juvenile monkeys was overlooked. It was not until 1984, when an MPMV-like D-type
retrovirus was isolated from rhesus monkeys with profound immunosuppression, that
this aspect of D-type retroviruses became apparent (3).

[...] known about the spumaviruses, or foamy viruses. Although [...] learned about
these viruses, an association with disease has [...] retroviruses are not always
pathogenic.

With the [...] sequence data now available, it has been possible to construct a
phylogenetic tree for the retrovirus family (Fig. 1). Certain groups, such as the
human T-cell lymphotropic virus (HTLV) group of oncoviruses and the spumaviruses,
are well defined. Other groups, such as the lentiviruses, are diverse. Even [...]
lentiviruses, the genetic organization of other parts of the genome [...].
Recombination between retroviruses of two different [...] complicating
classification (5).

Finally, it is important not to forget the endogenous retroviruses. These are
proviruses, invariably defective, that replicate along with the host cell genome.
There are examples akin to the murine and avian leukemia viruses as well as the
B-type and D-type retroviruses. Endogenous retroviruses homologous to
lentiviruses, spumaviruses, or the HTLV group do not appear to exist.

Lentiviruses, named for their long incubation period between infection and overt
[Tree not reproduced. Legible branch labels: LENTIVIRUSES, MoMLV, HERC, HSRV.]

FIG. 1. A phylogeny of retroviruses based on reverse transcriptase sequences.
An asterisk [...]. HTLV-1 and HTLV-2, human T-cell lymphotropic virus types 1 and
2; BLV, bovine leukemia virus; MMTV, mouse mammary tumor virus; HERK, human
endogenous [...]; IAP, hamster intracisternal A-type particle; RSV, [...]; SIV,
simian immunodeficiency virus from African green monkeys; HIV-1 and HIV-2, human
immunodeficiency virus types 1 and 2; EIAV, equine infectious anemia virus; [...],
maedi/visna virus; MoMLV, Moloney murine leukemia virus; HERC, human endogenous
retrovirus C-type family; HSRV, human spuma or foamy retrovirus. Adapted from
ref. 125.


disease, have, until AIDS, been neglected despite their association with neurologic,
pulmonary, arthritic, and hematologic diseases in outbred animal populations (6).
The prototype lentivirus, the visna virus, was first reported in 1960 as the cause
of an epidemic of deaths among Icelandic sheep in the 1930s and 1940s (7). Visna,
which means "wasting" in Icelandic, primarily causes neurologic disease and
pulmonary lesions. Lentiviruses were later discovered in goats (the caprine
arthritis encephalitis virus) and in horses (the equine infectious anemia virus)
(8,9). The former provokes leukoencephalopathy in kid goats and chronic arthritis in
adult goats; the latter causes anemia in horses. Their lack of association with
tumors led lentiviruses to be neglected, however. For example, the bovine
immunodeficiency virus, a milk-borne lentivirus, was first isolated in 1972, yet
ignored until 1987, after the AIDS viruses had been well established (10).

Lentivirology changed dramatically with the discovery of HIV in 1983 (11) (Table
1). Within five years the simian immunodeficiency virus (SIV) had been isolated
from macaques (12), African green monkeys (13), mandrills (14), sooty mangabeys
(15,16), and chimpanzees (17). A second human virus, HIV-2 (18,19), was identified
in West Africa, and the discovery of the feline immunodeficiency virus, a widely
diffused virus of cats, soon followed (20). It is likely that more lentiviruses will be
[Genome maps not reproduced. Legible panel labels: HIV-1, SIVcpz, HIV-2, SIVmac,
SIVagm, SIVmnd; legible orf labels: gag, pol, vpx, vpr.]

FIG. 2. Organization of the coding potential of primate immunodeficiency viruses.
The LTR structures that flank the orfs have been omitted for clarity. The vpu orf
specific to HIV-1 and [...] orf specific to HIV-2 and SIV[...] are worth noting. The
abbreviations are the same as those in Tables 1 and 3.



discovered in the years to come. Ironically, lentivirology has begun to eclipse
research on other retroviruses, and AIDS-related research is now a driving force in
biomedical research. (See Table 2 for a general comparison of characteristics of
several human retroviruses.)

The few lines devoted here to these viruses are not simply for historic interest, for
TABLE 1. Mammalian lentiviruses

Lentivirus                                        Disease                                     Reference

Ovine maedi/visna virus (OMVV) or visna virus     Pulmonary (maedi) and neurologic (visna)    7
Caprine arthritis encephalitis virus (CAEV)       Leukoencephalopathy in kid goats and        8
                                                    arthritis in adult goats
Equine infectious anemia virus                    Infectious anemia                           9
Bovine immunodeficiency virus                     Unknown                                     10
Human immunodeficiency virus type 1 (HIV-1)       AIDS                                        11
Simian immunodeficiency virus of captive          AIDS                                        12
  macaques (SIVmac)
Human immunodeficiency virus type 2 (HIV-2)       AIDS                                        18,19
Simian immunodeficiency virus of sooty            Unapparent disease so far*                  15,16
  mangabeys (SIVsm)
Feline immunodeficiency virus (FIV)               AIDS                                        20
Simian immunodeficiency virus of African          Unapparent disease so far*                  13
  green monkeys (SIVagm)
Simian immunodeficiency virus of mandrills        Unapparent disease so far*                  14
  (SIVmnd)
Simian immunodeficiency virus of chimpanzees      Unapparent disease so far*                  17
  (SIVcpz)

*"Unapparent disease so far" does not mean that the viruses are apathogenic, but that
data gathered either in the wild or with adequate follow-up time in captivity are
insufficient to discern the pathogenic potential of these viruses.



in comparing their genetic structures (Fig. 2) we may start to appreciate the
extraordinary plasticity of the retroviral genome. The comparison also helps us to
realize which viruses constitute genetically correct models of human lentiviruses.
Thus, none of the ungulate viruses has as complex a genetic structure as HIV-1 and
HIV-2, demonstrating that, although they may provide powerful insights into the
mechanisms of lentiviral disease, their value is somewhat circumscribed. Indeed, the
incubation time from infection to overt clinical disease may take an animal's
lifetime. The signal importance of the macrophage in these infections was first
documented for the ungulate lentiviruses (21).

TABLE 2. Characteristics of various human retroviruses

Retrovirus  Subfamily     Clinical                  Major target cell           Characteristics of
                          characteristics                                       virus particle

HTLV-I      Oncovirinae   ATL; HAM/TSP              CD4+ T cell (non-lymphoid   80-120 nm enveloped,
                                                    cells in vitro)             spherical electron-dense core
HTLV-II     Oncovirinae   Hairy cell leukemia (?)   CD8+ T cell in vivo         80-120 nm enveloped,
                                                                                spherical electron-dense core
HFV         Spumavirinae  None established          Multiple cell types         100-140 nm enveloped,
                                                                                spherical core
HIV-1       Lentivirinae  Immunodeficiency          CD4+ T cell,                80-120 nm enveloped,
                                                    monocyte/macrophage         cylindrical electron-dense core
HIV-2       Lentivirinae  Immunodeficiency          CD4+ T cell,                80-120 nm enveloped,
                                                    monocyte/macrophage         cylindrical electron-dense core

ATL = adult T-cell leukemia; HAM/TSP = HTLV-I-associated myelopathy/tropical spastic
paraparesis; HFV = human foamy virus.

The chimpanzee lentivirus, SIVcpz (22), is the primate counterpart of HIV-1,
whereas lentiviruses in the macaque (SIVmac) and sooty mangabey (SIVsm) are closest
to HIV-2 (23,24). Given the problems inherent in using chimpanzees for experimental
research, including their scarcity, we are effectively without a manageable animal
model for HIV-1 infection. The SIVmac/sm model and the HIV-2-adapted infection of
macaques clearly hold tremendous promise for ultimately understanding the HIVs
and AIDS pathogenesis. (See Chapter 5, this volume.)

We will devote the rest of the chapter to the biology of HIV-1. (Throughout the
chapter "HIV" will refer to HIV-1 unless specified as HIV-2.) We will discuss
virus-host interactions, the replicative strategy of the virus, the assembly of the
virion, and the tremendous genetic diversity of the virus.



THE VIRUS AND THE CELL

Retroviruses are enveloped viruses with two copies of encapsidated plus-stranded
genomic RNA. The lipid membrane is of cellular origin and is acquired during the
budding of the virus from the cell surface. Table 3 shows the viral proteins along
with their principal functions. Figure 2 illustrates the position of the gene encoding
each protein and Fig. 3B illustrates the position of each protein in virus particles.

HIV has two virions: the "conventional" virion (Fig. 3A and B) and the latently
infected lymphocyte. Both are probably transmitted in all forms of infection, with
the exception of cell-free blood products. The proportion of cell-free virus in the
blood is substantial just before seroconversion, after which it declines to low levels
(25,26). The titers then rise with declining clinical status. Interestingly, much
cell-free virus is associated with antibody. The relative contribution of cell-free or
cell-associated virus to natural infection is not known. The latently infected
lymphocyte is activated in the new host by a mixed lymphocyte reaction, resulting in
the production of numerous conventional virions.

TABLE 3. HIV proteins and their known characteristics and functions

Structural
  Gag Pr55        Structural nucleocapsid precursor; NH2 myristylated, directing it
                  to inner cytoplasmic membrane; cleaved by viral protease
    p17MA         Gag matrix protein; core protein anchoring virion envelope via
                  NH2 myristylated terminus
    p24CA         Gag capsid protein within the p17 shell
    p14NC         Gag nucleic acid binding protein; presumably condenses RNA
                  genomes within virion; frequently cleaved into p7 and p6, with
                  p7 retaining binding capacity
  Env gp160       Highly glycosylated precursor to env products
    gp120         Surface glycoprotein, >50% carbohydrate; binds to HIV receptor
                  CD4 molecule
    gp41          Transmembrane protein; amino terminus fuses with plasma membrane
Enzymatic
  Gag-Pol Pr180   Precursor encoding viral enzymes in form of polyprotein; cleaved
                  pol in budding and immature complete virions
    p12PR         Viral proteinase belonging to the aspartic acid group; ensures
                  cleavage of the gag and gag-pol precursors; functions as a dimer
    p66RT         Mature reverse transcriptase/RNaseH; functions as a dimer; one
                  subunit is cleaved by the viral protease, leaving a functional
                  p66:51 heterodimer
    p32IN         Viral endonuclease/integrase; results in integration of the
                  provirus
Regulatory
  Tat p16/p14     Viral transactivator; nuclear/nucleolar localization; binds TAR
  Rev p19         RNA transport phosphoprotein, primarily nucleolar; binds RRE
Accessory
  Vif p23         Viral infectivity factor
  Vpr p10-15      Virion associated
  Vpu p16         Unique to HIV-1; helps intracellular transport of gp160 through
                  Golgi and promotes release of virions of regular morphology
  Nef p27         Myristylated phosphoprotein associated with inner cytoplasmic
                  membrane; necessary for high viral load in vivo
  Vpx p12-16      Virion-associated protein unique to HIV-2

TAR, tat response element; RRE, rev-responsive element.

Cell Tropism

A combination of studies based on polymerase chain reaction (PCR) and in situ
hybridization has shown that whereas the number of peripheral blood mononuclear
cells (PBMCs) supporting active replication is of the order of 1/100,000 (27), the
number of latently infected PBMCs can vary from 1/50,000 to 1/100 depending on
disease stage (28,29). In vivo, HIV is found essentially in CD4+ lymphocytes, and
then mainly within CD4+ memory cells (30). Antigen-presenting cells, monocytes,
macrophages, Langerhans cells, Kupffer cells, and microglial cells also can be
infected. Although there are some reports of other cell types naturally harboring
HIV, there is no good consensus on this point. In vitro, however, HIV can infect cell
lines of lymphoid, hepatocyte, and fibroblast types, suggesting another mechanism
for virus-cell interaction and infection. A novel galactocerebroside molecule has
been identified as the receptor for HIV on CD4-negative glial and neuroblastoma
cells (31). The dissociation constant (Kd) for this interaction was surprisingly low
(10^[...] M). The relevance of the preceding in vitro data to natural infection is not
clear, however, and these data probably should be treated with caution. The viral
long terminal repeat (LTR) does not appear to play a role in determining cell tropism
as was postulated for a variety of murine retroviruses.
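
The PBMC frequencies quoted above are easier to compare once converted into expected cell counts. The following is a minimal sketch only: the sample size of one million cells is a hypothetical value chosen for illustration, and the function and constant names are not from the source.

```python
from fractions import Fraction

# Frequencies quoted in the text, expressed as fractions of PBMCs.
ACTIVE_REPLICATION = Fraction(1, 100_000)  # cells supporting active replication
LATENT_LOW = Fraction(1, 50_000)           # lower bound, latently infected cells
LATENT_HIGH = Fraction(1, 100)             # upper bound, latently infected cells


def expected_cells(sample_size: int, frequency: Fraction) -> Fraction:
    """Expected number of cells of a given class in a sample of PBMCs."""
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    return sample_size * frequency


# Hypothetical sample size, chosen only to make the arithmetic concrete.
SAMPLE = 1_000_000

active = expected_cells(SAMPLE, ACTIVE_REPLICATION)
latent_low = expected_cells(SAMPLE, LATENT_LOW)
latent_high = expected_cells(SAMPLE, LATENT_HIGH)

# Ratio of latently infected to actively replicating cells at each bound.
ratio_low = latent_low / active
ratio_high = latent_high / active
```

Under these figures, latently infected cells would outnumber actively replicating cells by a factor between 2 and 1,000, depending on disease stage.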



Receptor 

The receptor for HIV, and for all the primate immunodeficiency viruses, is the
CD4 molecule (32,33), a 55-kDa surface glycoprotein belonging to the immunoglobulin
superfamily (34). CD4 is recognized by the HIV surface glycoprotein gp120, an
interaction that is extremely specific and tight yet depends on the isolate used.
Thus, the Kd is of the order of 4 x 10^-9 M (35). The three-dimensional structure
of CD4 is known (36,37), and the major residues involved in binding have been
identified. They essentially map to residues 37 to 53, which form a loop at the
surface of the membrane-distal V1 domain (38,39).
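
To give a feel for what a dissociation constant of this magnitude implies, the sketch below computes fractional receptor occupancy under a simple 1:1 equilibrium binding model. Both the model and the Kd value of 4 nM used here are assumptions made for illustration; real gp120/CD4 binding on a cell surface is more complicated, and the function name is hypothetical.

```python
def fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Fraction of receptor occupied at equilibrium for 1:1 binding.

    Uses occupancy = [L] / ([L] + Kd), which assumes the free ligand
    concentration is not significantly depleted by binding.
    """
    if ligand_molar < 0:
        raise ValueError("ligand concentration must be non-negative")
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return ligand_molar / (ligand_molar + kd_molar)


# Assumed Kd for illustration only (4 nM).
KD_ASSUMED = 4e-9

# Occupancy at a few ligand concentrations, in mol/L.
occupancy_at_kd = fraction_bound(KD_ASSUMED, KD_ASSUMED)
occupancy_tenfold_below = fraction_bound(KD_ASSUMED / 10, KD_ASSUMED)
occupancy_tenfold_above = fraction_bound(KD_ASSUMED * 10, KD_ASSUMED)
```

By construction, half of the receptor is occupied when the ligand concentration equals Kd; a tenfold change in concentration on either side moves occupancy to roughly 9% or 91%.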

Several hypotheses have been proposed for the entry of HIV into cells: direct
fusion of the viral envelope with the cell membrane in a pH-independent manner (40);
receptor-mediated endocytosis (41); or both fusion and endocytosis. At some point
fusion of the virion and host cell membranes or vesicles occurs, mediated by the
hydrophobic terminus of the viral transmembrane protein gp41. Thus, recognition of
CD4 is essential yet insufficient for infection.
[Figure not reproduced. Legible labels in the sketch: core shell, MHC.]

FIG. 3. The HIV-1 virion. A: Electron micrographs of HIV-1 budding from the cell
surface membrane. The inset shows a perfectly symmetrical immature virion and a pair
of mature, and hence infectious, virions. B: Sketch of the mature HIV-1 virion. The
origin of the lateral bodies remains obscure. The core envelope link (CEL) may be
mediated by p6 (LI link protein), although this is not proved. From ref. 103.



Human T lymphocytes exist as long-term resting small lymphocytes and short-lived
activated blasts. HIV infects only the latter, which are activated by mitogens
in vitro and presumably by antigens in vivo. The few experiments reporting
infection of resting lymphocytes made extensive use of ultrasensitive PCR methods
(42). It is difficult not to believe that a few donor PBMCs were dividing when blood
was drawn or that preparation of the PBMCs activated a few lymphocytes.

The situation is different when it comes to antigen-presenting cells. HIV can enter
cells via the Fc receptor by internalization of antibody/virus complexes.
Antibody-enhanced infection could be blocked by monoclonal antibodies to FcRIII
(43,44). HIV probably also enters via phagocytosis. The important point to retain is
that antigen-presenting cells can produce HIV constitutively without the need for
external stimuli, probably because of activation of the precursor form of the NF-kB
transcription enhancer by the viral protease (45).



THE PROVIRUS 



Formation of the Provirus 



Once within the cytoplasm, reverse transcription of the single-stranded virion
RNA into double-stranded proviral DNA occurs within a permeable core-like
structure. The synthesis of the minus strand of DNA is initiated near the 5' end of
the plus strand RNA by a specific tRNA primer, tRNALys3 in the case of HIV-1, which
is co-packaged in the viral particles and binds via its 3' end to a complementary
sequence, the primer binding site (PBS), localized at the 3' end of the U5 region.
It has been suggested that the structure of the tRNALys anticodon loop is an
important factor in its recognition by HIV-1 reverse transcriptase (46).

DNA synthesis proceeds up to the 5' end of the RNA genome. Reverse transcriptase
has an intrinsic RNaseH activity: the ability to degrade RNA in the context of
a DNA/RNA hybrid. RNA degradation directly follows DNA polymerization, leaving
14 to 18 bases of complementary RNA and DNA double stranded. The resulting
single-stranded DNA may jump to the 3' terminus of the RNA molecule and complete
minus strand DNA synthesis. Initiation of plus strand DNA synthesis is mediated by
a polypurine RNA primer left intact because RNaseH does not cleave between purine
residues. For most retroviruses, this polypurine tract (PPT) is located just to the
5' end of U3. The double-stranded linear provirus is completed as shown in Fig. 4.

HIV, however, undergoes its genetic metamorphosis from RNA to DNA and back
to RNA, with one nuance peculiar to lentiviruses and spumaviruses. Plus strand
DNA synthesis is directed principally by two polypurine tracts (PPT) rather than
one. The second PPT is located in the middle of the genome. DNA initiation is even
more efficient from this site than from the classical site just 5' to the U3 region.
In the unintegrated genome this replication strategy results in a single-stranded
region in the center of the provirus. In fact, such a gapped structure was first
described in 1981 for the visna virus (47,48). This partly single-stranded, partly
double-stranded genome is reminiscent of the hepadnaviruses. The final product is a
double-stranded linear DNA that contains two copies of the LTR. The provirus is
translocated to the nucleus, where integration into the host genomic DNA occurs
essentially randomly.



Organization of the Provirus 

As Fig. 4 shows, replication results in two genetic metamorphoses: the reverse
transcription of RNA into a double-stranded provirus and the reorganization of the
genetic message. Thus, the U3 and U5 sequences are duplicated and reorganized.
[Diagram not reproduced. Legible labels: viral RNA(+), R U5 PBS gag pol env PPT U3 R
AAAAA 3'; tRNA primer; 1st jump; DNA(-) and DNA(+) intermediates; 2nd jump
(circularization); final double-stranded DNA, U3 R U5 PBS gag pol env PPT U3 R U5.]

FIG. 4. Replication strategy of a retrovirus from the encapsidated RNA genome through
to the double-stranded unintegrated provirus.



In addition, only the 5' copy of the R element of the LTR is reverse transcribed,
although it is present twice in the RNA genome.
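
The reorganization of the genetic message described above can be followed with a small bookkeeping sketch. It only tracks the order of named elements between the genomic RNA and the finished provirus, as drawn in Fig. 4; it does not model strand transfers, priming, or RNaseH, and the function name is hypothetical.

```python
# Element order of the genomic RNA, 5' to 3', as labeled in Fig. 4.
GENOMIC_RNA = ["R", "U5", "PBS", "gag", "pol", "env", "PPT", "U3", "R"]

# One complete long terminal repeat in the DNA provirus.
LTR = ["U3", "R", "U5"]


def provirus_layout(rna_elements: list) -> list:
    """Return the element order of the double-stranded provirus.

    The RNA is expected to read R-U5-<body>-U3-R. In the provirus both
    ends carry a full U3-R-U5 LTR, so U3 and U5 are each duplicated.
    """
    if rna_elements[:2] != ["R", "U5"] or rna_elements[-2:] != ["U3", "R"]:
        raise ValueError("expected RNA of the form R U5 ... U3 R")
    body = rna_elements[2:-2]  # everything between 5' U5 and 3' U3
    return LTR + body + LTR


provirus = provirus_layout(GENOMIC_RNA)
```

Counting elements before and after shows the duplication directly: U3 and U5 each occur once in the RNA and twice in the provirus, whereas R occurs twice in both.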

As with all retroviruses, the LTR structures of HIV harbor all the sequences
necessary for proviral transcription and termination. However, the lentiviruses are
complex in that, once transcribed, the fate of the full-length transcript is highly
controlled, resulting in a temporal expression of certain proteins. (This extra layer
of control, which involves sequences that map within coding sequences, will be dealt
with in a subsequent section.) It is impossible to use "gene" in the normal sense
when referring to retroviruses, because a single full-length transcript of the
provirus is spliced into more than 20 mature mRNAs (49). In addition, at least two
mRNAs are bicistronic (50). For clarity, reference will be made to introns, exons,
and open reading frames (orfs). An orf, which is part of one of three reading frames
by which a ribosome can translate RNA, signifies an absence of stop codons and hence
a protein coding potential. Between the two LTRs are nine orfs that encode 14 mature
proteins.

There are four types of proteins (Table 3, Fig. 5): the viral structural proteins
encoded by the gag and env orfs; the viral enzymes (pol orf); two proteins intimately
associated with viral expression and its regulation, Tat and Rev; and a number of
accessory proteins, notably Vif, Vpr, Vpu, and Nef.

Several secondary RNA structures are essential in the control of proviral expres- 
sion, translation, and packaging. Furthermore, a large number of splice donor and 
acceptor sites overlap coding sequences. The importance of these features will be 
explored in subsequent sections. 



REGULATION OF PROVIRAL TRANSCRIPTION 

The transcription of the HIV provirus and the temporal ordering of splicing
represent two profoundly original aspects of HIV virology. With hindsight it becomes
clear that HIV has brilliantly solved two biologic conundrums. The first is how to
survive in a lymphocyte that spends most of its time as a nondividing, resting
lymphocyte and only a small fraction of its time as an activated lymphoblast. The
second is how to ensure the expression of nine orfs from one full-length genomic
RNA. The solutions the virus uses for these events involve novel viral proteins
called Tat and Rev.

FIG. 5. HIV-1 proteins derived from polyprotein precursors. The Gag and Gag-Pol
polyprotein precursors are cleaved by the viral proteinase p9. Cleavages of the
envelope signal peptide and between gp120 and gp41 are the result of cellular
enzymes. Myr indicates the myristylation at the amino terminus of the Gag and
Gag-Pol polyproteins.



Transcription Activation 

For convenience, proviral transcription will be considered as starting from a
latently infected lymphocyte. In a resting cell, it is presumed that there are no
transcripts; presumed because it is virtually impossible to examine truly resting
cells. Upon activation of a resting lymphocyte, the cell starts to produce many
growth factors. The HIV LTR is susceptible to a number of these.

Transcription is initiated within the 5' LTR and terminates within the 3' LTR.
Transcription from the 3' LTR is blocked by some unknown phenomenon. The 3'
LTR becomes functional, however, as soon as it is uncoupled from the 5' LTR. The
HIV-1 LTR is divided into three domains: U3, R, and U5. By definition,
transcription initiation starts at nucleotide +1. All the cis-acting elements
involved in HIV expression are localized in the U3 and R region. DNaseI footprinting
of the HIV-1 LTR revealed four regions within the U3-R region that served as binding
sites for numerous cellular proteins such as NF-kB, EBP-1, Sp1, CTF-1/NF-1, TATA
binding protein, LBP-1, and UBP-1 (51,52). The target sequences for these proteins
have been confirmed by mutagenesis and deletion studies. There are two copies of an
NF-kB target sequence, or enhancer, and three of the target for the Sp1 factor.

The NF-kB enhancer sequence is perhaps a misnomer in that the two copies may
be deleted from a functional provirus with little consequence. Deletion of the
sequences from an isolated LTR would substantially reduce transcription, indicating
the need for functional studies to be carried out on the complete provirus. Such
studies indicated that the juxtaposed 3' NF-kB and 5' Sp1 sites were somehow
intimately responsible for most of proviral transcription mediated by these two
factors. Interestingly, the targets for CTF-1/NF-1 and LBP-1 beyond the transcription
initiation site suggest that they may modulate the transcription complex more
directly.

Other binding sites for eucaryotic transcription factors have been identified,
including those for AP-1, NFAT-1, USF, NF-1, and SP50 (for reviews, see refs. 51 and
52). The sequence between -420 and -154 has been shown to contain a functional
negative response element (NRE). The role of NRE in HIV replication has not been
clearly defined.

The LTR may be activated in vitro by various T-cell mitogens such as phorbol
esters, butyrate, interleukin 2, PHA, and PMA. Presumably the lymphocytes are
activated principally by antigen in vivo. The LTR also may be activated by
regulatory proteins of several DNA viruses including herpes simplex virus types 1
and 2, adenovirus, cytomegalovirus, hepadnavirus, pseudorabies virus, Epstein-Barr
virus, human herpesvirus 6, and papovavirus. In addition, the transcription
transactivator proteins Tax from HTLV-I and the spumaviral Bel 1 protein also may
activate the LTR in vitro. Whether these phenomena are meaningful in vivo remains
to be established. For these phenomena to be true, the silently infected resting T
lymphocyte must be superinfected by a second virus, which is theoretically possible
for [...], cytomegalovirus, and perhaps certain adenoviruses. Finally, ultraviolet
radiation also can induce HIV-1 transcription (53).
Once activated, the proviral transcription proceeds. Transcription is not efficient
even in activated cells, with most transcripts terminating around +59 from the
transcription start site, giving rise to a double-stranded RNA stem and loop
structure called the tat response element (TAR). A small proportion goes beyond this
point, resulting in full-length transcripts. These are rapidly spliced into small
molecular weight mRNAs encoding principally Tat and Rev, two small yet complicated
proteins whose function is still not completely understood.

Amplification of Transcription 

Tat is an 86- to 100-residue protein of approximately 15.5 kDa; however, its first
56 residues are sufficient for full activity (54). Tat is a nuclear/nucleolar protein
(55) that binds to a specific motif, called the bulge, in the TAR region of nascent
RNA. A highly conserved cluster of seven cysteine residues has been implicated in
dimerization via two zinc metal ions (56). The cysteine motif in the Tat protein
differs significantly from the classic description of a zinc finger binding domain
but is not unlike the cysteine-rich clusters of other metalloproteins. It is unclear
whether this property is important for Tat function in vivo, as other reports have
Tat active as a monomer (57).

Tat becomes a potent transactivator of viral gene expression via binding to the
TAR region, a stem and loop structure in the R region of RNA. Both the location
and orientation of TAR are critical for function (for reviews, see refs. 51 and 52).
Whereas Tat binds to the bulge, a cellular protein is believed to bind to the loop. A
number of phenomena then occur: the viral transcription initiation increases, the
short transcripts elongate, and the initiation complexes stabilize. The increase in
viral mRNA and protein synthesis stimulated by Tat is greater than 100-fold in human
HeLa cells or lymphocytes.

It has been suggested that transactivation by Tat is partly post-transcriptional.
Using Xenopus oocytes, Tat was shown to activate the expression of presynthesized
TAR mRNA despite the presence of transcriptional inhibitors (58). Yet the presence
of TAR as a double-stranded RNA may inhibit protein synthesis in trans. TAR
can mediate the autophosphorylation and activation of the double-stranded
RNA-dependent kinase (p68 kinase), which specifically can phosphorylate the
eucaryotic initiation factor 2 (eIF-2), abolishing mRNA translation. This area of
research needs more elucidation.

The dominant role of Tat is as a transcriptional activator. Of the two proteins made 
from small (1.8-2 kb) mRNAs, Tat and Rev, Tat acts first. Within two to three hours 
of infection the level of Rev becomes sufficient for it to alter viral expression pro- 
foundly (59). 

Coordination of Transcription 

Rev is a small molecule of some 116 residues (19 kDa) (60), of which the first 88
are sufficient for activity. Like Tat, Rev is a nuclear protein primarily found in
the nucleolus. Rev binds tightly (Kd in the nanomolar range) (61,62) to a complex
RNA secondary/tertiary structure called the rev-responsive element (RRE), which is
234 bases long. As Fig. 5 shows, RRE is located within the env orf, or in the intron
located at the 3' end of the genome. A single Rev molecule may bind to RRE,
resulting in multiple activities. A number of domains include a highly basic
sequence responsible for nuclear localization (63) and RNA binding. Another region
of Rev is involved in multimerization (64). Although Rev is phosphorylated at two
serine residues, phosphorylation is not essential to Rev function.

Because RRE is located in the most 3' intron, Rev cannot bind to the small mRNAs
to influence their expression. By contrast, once Rev is bound to the larger RRE+
mRNAs, they can exit the nucleus into the cytoplasm, where they are promptly
translated. The intermediate RRE+ mRNAs (4-5.5 kb) give rise to vif, vpr, tat, and
vpu + env expression, and the largest 9-kb species, the unspliced full-length genomic
RNA, ensures expression of gag and pol. Many splice donor and acceptor sequences are
located within the central region of the genome. By alternative splicing more than
20 distinct mRNAs have been described (49,54). Many differ by small noncoding
leader exons. The major mRNAs are shown in Fig. 6.
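
The three size classes of mRNA and their dependence on Rev can be summarized as a lookup. The sketch below is a deliberate simplification: the size ranges and products are those quoted in the text, but the tolerance value, the names, and the all-or-nothing export rule are assumptions made for illustration rather than a model of the actual kinetics.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MrnaClass:
    name: str
    min_kb: float
    max_kb: float
    has_rre: bool
    products: Tuple[str, ...]


# Size classes and products as quoted in the text.
CLASSES = (
    MrnaClass("small", 1.8, 2.0, False, ("tat", "rev", "nef")),
    MrnaClass("intermediate", 4.0, 5.5, True, ("vif", "vpr", "tat", "vpu+env")),
    MrnaClass("full-length", 9.0, 9.0, True, ("gag", "pol")),
)


def classify(length_kb: float, tolerance_kb: float = 0.25) -> Optional[MrnaClass]:
    """Assign an mRNA to a size class, or None if it fits no class.

    tolerance_kb is an assumed allowance for measurement error.
    """
    for mrna_class in CLASSES:
        low = mrna_class.min_kb - tolerance_kb
        high = mrna_class.max_kb + tolerance_kb
        if low <= length_kb <= high:
            return mrna_class
    return None


def reaches_cytoplasm(mrna_class: MrnaClass, rev_present: bool) -> bool:
    """Toy rule: RRE+ mRNAs need Rev to exit the nucleus; RRE- mRNAs do not."""
    return rev_present or not mrna_class.has_rre
```

With this rule, a 1.9-kb message is expressed whether or not Rev is present, whereas the 9-kb genomic RNA reaches the cytoplasm only once Rev has accumulated, which is the early-to-late switch the text describes.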

Finally, an additional transcript that contains tat, env, and rev sequences and
encodes a protein, Tev, has been described. Little is known about its role, however
(65).

Note that the structural proteins Gag and Env and the Pol-derived enzymes are
produced later in the cycle. Such temporal regulation of expression may be likened
to that of polyoma virus, adenovirus, or herpesvirus expression. Since this virus
has a single operon, the Rev protein represents a means of achieving a balance
between the expression of early and late transcripts.

Obviously, once Rev starts acting it has a positive effect on gag, pol, vif, vpr,
vpu, and env expression, as they are all made from RRE+ mRNAs (Fig. 6). Rev has a
negative effect on its own expression and on that of nef. Large transcripts are made
at the expense of the small ones. Rev has little effect on tat expression, however,
since there is a second tat mRNA, a 4.5-kb RRE+ species (Fig. 6).




FIG. 6. mRNA species for HIV-1: gag + gag-pol, vif, vpu + env, tat-2, rev + nef, and
nef. Only the major species have been shown. Rev-dependent mRNAs are those bearing
RRE. Adapted from ref. 49.

To summarize: first, small transcripts are produced, ensuring tat, rev, and nef
expression. Two to three hours later there is enough Rev to ensure expression of the
larger mRNAs (59). However, as tat is not down-regulated by Rev, the cell continues
to churn out huge numbers of HIV transcripts, their fate being determined by the
steady-state concentration of Rev in the nucleus.
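
The early/late switch summarized above can be sketched as a small lookup. The gene assignments follow the text; the dictionary layout, the class names, and the function `expressed_genes` are assumptions made for this illustration, and the model is deliberately binary (Rev absent or present).

```python
# Illustrative lookup of the mRNA classes described in the text.
# The dictionary layout and function name are assumptions made for this sketch.
MRNA_CLASSES = {
    "small": {"has_rre": False, "genes": {"tat", "rev", "nef"}},
    "intermediate": {"has_rre": True, "genes": {"vif", "vpr", "tat", "vpu", "env"}},
    "full_length": {"has_rre": True, "genes": {"gag", "pol"}},
}


def expressed_genes(rev_present):
    """Return the genes whose mRNAs are taken to reach the cytoplasm.

    Simplification: RRE-bearing mRNAs are exported only when Rev is present,
    and RRE-free mRNAs are always exported. The gradual decline of small
    transcripts once Rev acts is not modeled.
    """
    genes = set()
    for info in MRNA_CLASSES.values():
        if rev_present or not info["has_rre"]:
            genes |= info["genes"]
    return genes


early = expressed_genes(rev_present=False)
late = expressed_genes(rev_present=True)
print("early:", sorted(early))
print("added once Rev accumulates:", sorted(late - early))
```

Because tat appears in both the small and the intermediate class, it stays in the expressed set whether or not Rev is present, which is the point made in the summary.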

It is a source of amazement that a remarkably complex virus should derive so 
many different mRNAs from a single initial transcript and still survive. Perhaps the 
extraordinarily powerful Tat-driven transactivation of the genome results in so much 
viral RNA that even dividing it by nine (for each orf) results in sufficient amounts of 
each specific RNA. 

The precise mechanism by which Rev functions is not clear. It was suggested that
Rev binding to RRE blocks splicing in the spliceosome (66). However, Rev is probably
involved in transport between the nucleus and cytoplasm via the nucleolus. A recent
report showed that human nucleolar B23 shuttle protein specifically binds Rev. This
association can be dissociated by RRE RNA in vitro and presumably in the nucleolus
in vivo (67). It is as though Rev has adapted to use a host cell nucleus/cytoplasm
shuttle system. As Rev binds RRE in the nucleolus, B23 becomes displaced, and
Rev/RRE+ mRNAs proceed into the cytoplasm. Rev then returns to the nucleus.



Bicistronic mRNAs 

Two species of viral mRNA are bicistronic: the vpu/env and rev/nef mRNAs (50).
In both cases the first initiator methionine (vpu or rev) is in a "weak" context for
efficient translation whereas the second initiator methionine (env or nef) is in a
"strong" context.



VIRION POLYPROTEINS 

Gag Nucleocapsid Protein 

The Gag polyprotein precursor is expressed from translation of full-length viral
RNA. Pr55 Gag is synthesized in the form of a 55-kDa precursor of approximately
500 amino acid residues. The NH2 terminus is rapidly myristylated, which directs the
precursor to the inner cytoplasmic membrane. Cleavage occurs during maturation
of the complete immature virion and results in four proteins (Fig. 5). Pr55 Gag is
cleaved uniquely by the HIV-1 aspartic protease. Both proteolytic cleavage and
myristylation are required for the production of infectious virus particles. The
NH2-terminal protein, known as the p18 MA or matrix protein, forms the inner protein
"shell" supporting the lipid bilayer. The major core protein, p24 capsid antigen
(CA), self-aggregates to form the icosahedral inner shell within which the two RNA
molecules are located. There are approximately 1,500 copies of CA per virion (68).
The third Gag protein (p17 NA, or the nucleic acid binding protein) presumably is
responsible for the condensation of two RNA genomes within the confined space of the
virion core. P17 harbors a repetition of a zinc finger-like motif (CX2CX4HX4C) and is
highly basic. Although its mobility on an SDS gel corresponds to approximately
17 kDa, its real molecular weight is 9.5 kDa. It also may be cleaved in two, giving
rise to p7 and p6 proteins. All three proteins (p17, p7, and p6) may be found in the
virion. P7 retains the basic region and the cluster of cysteine residues. Mutations
of the cysteine and histidine residues in p17/p7 produce virions that are defective
for packaging. The function of proline-rich p6 remains unclear, although mutations
within it block virus infectivity. Finally, defective HIV gag mutants of p17 can
dominantly interfere with the replication of the wild-type virus (69), suggesting a
novel possibility for gene therapy.
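
The zinc finger-like motif CX2CX4HX4C quoted above translates directly into a regular expression, where each X becomes a single-residue wildcard. The following is a minimal sketch; the peptide string is an illustrative example chosen for this sketch and is not taken from the text, and the function name `find_zinc_fingers` is hypothetical.

```python
import re

# CX2CX4HX4C: Cys, any 2 residues, Cys, any 4 residues, His, any 4 residues, Cys.
ZINC_FINGER = re.compile(r"C.{2}C.{4}H.{4}C")


def find_zinc_fingers(peptide):
    """Return (start index, matched segment) for each non-overlapping motif."""
    return [(m.start(), m.group()) for m in ZINC_FINGER.finditer(peptide)]


# Illustrative peptide (an assumption for this example), one-letter amino acid code.
peptide = "KCFNCGKEGHIAKNCRAPRKKGCWKCGKEGHQMKDCTERQ"
hits = find_zinc_fingers(peptide)
for start, segment in hits:
    print(start, segment)
```

A repetition of the motif, as described for p17, shows up as two separate hits in the same peptide.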



Pol-associated Enzymes 

Like gag, pol is expressed from full-length genomic RNA. The principal viral
enzymes are made as part of a 180-kDa Gag-Pol polyprotein precursor (Fig. 5). Pol is
always expressed in concert with gag. This occurs within the region of overlap
between the gag and pol orfs by a -1 ribosomal frameshift during translation of
genomic RNA. The site of the -1 frameshift is specific to a run of U residues and
relies on an adjacent stem and loop structure, sometimes called a pseudoknot
structure, for positioning (70,71). Its efficiency is approximately 10% to 15%; that
is, 0.1 to 0.15 Gag-Pol polyproteins are synthesized for every Gag precursor. As the
number of CA proteins has been estimated at approximately 1,500 per virion (68), the
number of pol-derived molecules per virion is approximately 150 to 200. Pol encodes
three distinct proteins: the protease, the reverse transcriptase/RNaseH, and the
endonuclease/integrase.
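
The estimate of pol-derived molecules per virion follows from multiplying the number of Gag precursors by the frameshift efficiency. The sketch below assumes one CA protein per Gag precursor, so that roughly 1,500 precursors are packaged; the function name is hypothetical.

```python
def gag_pol_per_virion(gag_copies, frameshift_efficiency):
    """Estimate Gag-Pol precursors per virion from the frameshift efficiency."""
    return gag_copies * frameshift_efficiency


GAG_COPIES = 1500  # approximate CA copies per virion quoted in the text
low = gag_pol_per_virion(GAG_COPIES, 0.10)
high = gag_pol_per_virion(GAG_COPIES, 0.15)
print(f"Gag-Pol per virion: about {low:.0f} to {high:.0f}")
```

The upper end of this simple product is slightly above the rounded range of 150 to 200 given in the text, which is to be expected given that both inputs are approximate.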

The p9 protease, a member of the aspartic protease family, functions as a dimer.
Within the immature virion, monomeric protease, in the form of a Gag-Pol polyprotein
precursor, dimerizes, resulting first in autocleavage and subsequently in cleavage of
other substrates. Cleavage of the Pr55 Gag and Pr190 Gag-Pol proteins results in the
morphologic changes termed maturation, as seen by electron microscopy. The
three-dimensional structure of the p9 protease was the first to be determined among
the HIV proteins (72). It holds considerable promise for the design of anti-HIV
reagents. As mentioned earlier, the HIV protease is an aspartic protease (73) and has
a conserved pair of aspartic acid residues in the active site. Mutation of either
results in immature and totally defective virions (74). Synthetic peptide analogues
are potent inhibitors of purified HIV-1 protease.

The reverse transcriptase and RNaseH functions map to the NH2 and COOH domains,
respectively, of the p66 protein. Reverse transcriptase can exist as a p66 dimer or a
p66:p51 heterodimer in which the RNaseH domain of one of the p66 subunits is cleaved
away by the viral protease (75). Although both are functional in in vitro
polymerization assays, the p66:p51 heterodimer is the most abundant form in the
virion. Extensive mutagenesis studies have identified many key residues. Recently a
high-resolution 3.5-Å crystal structure of the p66:p51 heterodimer has emerged (76).
The DNA polymerization and RNaseH active sites are clearly distinguished (77,78),
being separated by approximately 15 bases of duplex DNA, a value that supports the
observations that RNaseH digestion of an RNA template succeeds polymerization by a
similar number of bases (79). Considerable conformational changes accompany duplex
binding (78).

The p66 molecule comprises five distinct domains known as the finger, palm, thumb,
connecting, and RNaseH domains. The p51 molecule is composed of only the first four.
The most surprising finding is the remarkable structural asymmetry of the two
subunits. Even though p51 is derived from p66, the conformation of the thumb and
connecting domains is radically different. They effectively block the DNA catalytic
site of the p51 molecule. Thus, there is just one DNA catalytic site, made up of a
cleft formed by the finger, palm, and thumb domains of the p66 subunit. This loss of
catalytic potential is more than offset by an increase in affinity of the heterodimer
for the primer-template (80).

Those residues linked with resistance to RTase-inhibiting drugs map, not
surprisingly, around the DNA catalytic and substrate binding sites. While the 3.5-Å
resolution was insufficient to place the side chain residues precisely, it is certain
that such work will lead to the conception of more and more RTase inhibitors.

The most carboxyl-terminal of Pol proteins is the endonuclease/integrase, p32. 
This enzyme trims the ends of proviral DNA by two bases. The substrate for this is 
a covalently closed circular provirus with two LTRs. The form that is subsequently 
integrated, however, is a linear provirus devoid of the terminal dinucleotides.
Chromosomal DNA is cleaved in a staggered fashion by the integrase, preferentially in
the major groove on the surface of a nucleosome (81). Filling in results in a five-base 
pair duplication of cellular DNA. 

Envelope Proteins 

As the literature pertaining to the HIV-1 envelope is voluminous, only the most
salient points will be developed here (for a review, see ref. 82). The envelope
polyprotein precursor, gPr160 (Fig. 5), is heavily glycosylated. The protein
precursor is progressively glycosylated as it passes through the endoplasmic
reticulum, Golgi, and trans-Golgi, resulting in the addition of up to 70 kDa in
carbohydrate. The amino-terminal leader sequence is cleaved by a cellular protease
associated with the docking protein complex. Cleavage into the mature surface and
transmembrane proteins, gp120 and gp41, occurs by the time the protein has reached
the trans-Golgi, probably by furin. Gp160, like gp120, is capable of binding the CD4
receptor, which can occur in the rough endoplasmic reticulum. The situation is
apparently saved by the HIV-1 Vpu protein, which helps to transport gp160 out of the
endoplasmic reticulum (83).

Gp41 remains anchored in the membrane. The long intracytoplasmic tail of 150 to 
160 residues of gp41 is somewhat reminiscent of a cell surface receptor. It is present 
in most lentiviruses, with the exception of EIAV and FIV. Although the hydrophobic
amino terminus of gp41 is involved in fusion between the virion and cellular mem- 
branes, the precise details are unknown. The first 129 amino acids of gp41 have been 
implicated in the assembly of Env protein tetramers (84). Indeed, cross-linking stud- 
ies have suggested that the Env naturally exists on the virion surface as a tetramer. 

The surface glycoprotein gp120, which is responsible for CD4 receptor recognition,
is noncovalently associated with gp41, although shedding of gp120 may occur within
five hours of synthesis (85). Gp120 is a 470- to 490-residue protein harboring 22
cysteine residues, all of which form intramolecular disulfide bonds (Fig. 7) (82).
Five hypervariable regions (V1-V5) are interspersed by constant regions. Four of the
V regions represent disulfide-bridged loops. Length varies considerably,
particularly within the V1, V2, and V4 regions. There are 21 to 27 N-linked
glycosylation sites, depending on the HIV-1 strain. For the single strain studied
(86), all but one of the gp120 sites were occupied. Thus, gp120 is approximately 50%
carbohydrate. The diverse structures of the side chains, those terminating in sialic
acid, are unusual: 11 were mannose rich whereas the remaining 13 were complex
glycans (86). There are no O-linked sugars. Much has been written about the
biologically important regions of gp120, although in the absence of a
three-dimensional structure caution must be exercised when interpreting these data.
A number of biologically important regions have been mapped, including some
involved in CD4 receptor or gp41 binding (for more details, see ref. 82).

One region of particular importance is the Gly-Pro-Gly (GPG) motif in V3 (Fig. 7).
Although it lies at the tip of a loop, the three residues are highly conserved.
Studies have suggested that proteolytic cleavage just behind this motif is a
prerequisite for cell-virion fusion. Research to date suggests that this region
influences the in vitro tropism of HIV. Thus, mutation of the GPG motif alters the
tropism of the virus (87). If true in vivo, then it is particularly bad news,
suggesting that HIV can readily broaden its tropism. A few naturally occurring HIV
variants bearing this mutation in vivo are now known. Other studies suggest that this
region is important in defining the preference for a macrophage- or
lymphocyte-adapted virus (88,89). With all this work, the problem of interpreting the
validity of using molecular-cloned virus derived from a tissue culture-adapted strain
is again evident. The use of established transformed immature lymphocytic and
monocytic cell lines further complicates the evaluation. The tremendous sequence
variation inherent to gp120 will be addressed in a subsequent section.



ACCESSORY VIRAL PROTEINS 

None of the accessory proteins of HIV-1 (Nef, Vif, Vpr, or Vpu) is essential for
replication. However, vif-negative, vpr-negative, and vpu-negative viruses replicate
to distinctly lower titers in vitro. Nef-negative mutants, although indistinguishable
from wild-type virus in vitro, have profound effects in vivo. This helps explain the
frequent finding of lesions in these orfs in vitro. In a few cases lesions were noted
in all four accessory gene orfs (90).

The Vif protein is a 23-kDa protein (Table 3, Fig. 5) translated from a
rev-dependent mRNA (Fig. 6). Although vif is an acronym for "viral infectivity
factor," the mechanism of its action is unclear. Vif-negative mutants yield low levels
of infectious viral particles (91). Vif does not seem to regulate viral gene
expression or be involved in assembly of virions. Vif may interact with the
cytoplasmic domain of gp41, perhaps via a cysteine protease activity (92).

Vpr is a conserved feature of all primate lentiviruses. The initial analysis of the
SIVagm sequence described the genome as vpx+ vpr- (93). It has been cogently
argued, however, that based on a careful phylogenetic analysis of the sequences, this
provirus actually encodes a vpx- vpr+ structure (94). In fact, it would appear that
vpx is a highly divergent yet duplicated vpr orf. Although vpr is nonessential for
HIV-1 infectivity, replication, or cell killing, its exact function is not completely
understood. Recently it has been suggested that Vpr is an integral part of the virion
(95), an intriguing hypothesis given that Vpx is virion associated (96).

Vpu is a nonglycosylated protein of 80 to 82 residues (97) unique to the HIV-1 and
SIVcpz lentiviruses. Although these viruses are clearly related, there is just 36%
sequence conservation between them (22). The amino terminus of p16 Vpu is
hydrophobic and probably membrane associated. Vpu is also phosphorylated.
Vpu-negative mutants show decreased virus production in the culture supernatant.
Virion morphology is highly irregular: diameter varies greatly and virions
occasionally are binucleated. Despite this irregularity, the proportion of virus
protein produced by such mutants is comparable to that of wild-type virus. It has
been suggested that Vpu may function as a matrix protein at the level of virus
assembly (98). Vpu is not incorporated into the virion, however. Recently it has been
shown to promote dissociation of intracellular gp120/CD4 complexes (83).

A voluminous, but until recently inconclusive, literature accompanies p27 Nef. It
is localized at the inner cytoplasmic membrane because of myristylation of the
amino-terminal glycine residue after elimination of the initiator methionine. GTP
binding and GTPase activities proved fleeting, as did analogies with oncogenes (99).
Although it was reported that Nef could be autophosphorylated or be a substrate for
protein kinase C at threonine 15, many nef isolates lack this residue, suggesting
that phosphorylation at this site may be gratuitous.

The acronym nef, for negative regulatory factor, derived from various studies that
showed that nef-negative mutant viruses replicated severalfold better than wild-type
viruses. It was reported that Nef acted through transcriptional inhibition. Carefully
controlled studies have since been unable to substantiate these findings.

Two recent studies have helped to elucidate the role of Nef. In a vector-based
expression system, it was shown that nef could induce down-regulation of CD4 surface
expression at a post-transcriptional level and by a mechanism independent of CD4
serine phosphorylation (101). An in vivo study showed that SIVmac nef-negative
viruses, derived from an infectious and pathogenic molecular clone and carrying an
irreversible deletion of 82 bases, gave rise to a low virus burden in experimentally
infected animals. These animals did not develop disease, whereas those infected with
wild-type virus did (102). Since CD4 is an accessory molecule to antigen
presentation, the possibility that down-regulation of CD4 might render infected cells
less vulnerable to cell-mediated killing may be considered.

The amount of Vpx incorporated within the virion is approximately equimolar to that
of the Gag proteins, suggesting that it is an essential virion protein. Yet HIV-1 and
the SIVs of African green monkeys (SIVagm) and mandrills (SIVmnd) replicate
efficiently without vpx.



VIRION ASSEMBLY 



Virion assembly occurs at the inner cytoplasmic membrane. The Gag and Gag-Pol
precursors are myristylated at their amino terminus and are thus directed to the
membrane. These precursors self-assemble (Fig. 7).

The envelope glycoproteins are also found at the cell surface, but on the outer
face, anchored by the transmembrane protein. Thus, during the budding process
directed by self-assembly of the Gag precursors, both host lipid and viral envelope
proteins are incorporated. Gag and Env proteins may interact, perhaps via the long
cytoplasmic tail of gp41, although this has not been proved. Assembly can be achieved
without any viral protein other than the Gag or Gag-Pol precursors. (For a review of
this subject, see ref. 103.)

The arrangement of the Gag proteins (the matrix protein, the major core protein, and
the nucleic acid binding protein) within the Gag precursor ensures that they are
where they should be within the virion: the nucleic acid binding protein at the
center, the major core protein surrounding it, and the matrix protein surrounding
the major core protein.

Packaging of genomic RNA is achieved first by dimerization of two RNA molecules.
The packaging signal has not been completely mapped. A secondary structure within
the 5' noncoding region has been implicated, as have sequences in the gag region. It
is significant that these elements are found only in full-length RNA; the spliced
mRNAs, such as vif or env, lack these sequences. The nucleic acid binding protein may
facilitate dimerization and packaging.

Once budded, the virion breaks away from the cell surface, resulting in an immature
virion. For reasons not yet known, the protease within the Gag-Pol precursor is
activated, resulting in cleavage of the Gag and Gag-Pol precursors present in the
immature virion. This cleavage results in the condensation of the core and the
formation of the characteristic morphology of the mature infectious virion. As
mentioned earlier, both proteolytic cleavage and myristylation are required for the
production of infectious virus particles.



HIV QUASISPECIES: GENETIC AND ANTIGENIC DIVERSITY 

Retroviral replication, indeed all RNA viral replication, proceeds in the absence
of any proofreading mechanism. Consequently, polymerase errors, such as nucleotide
misincorporation, are not corrected. As RNA viral genomes are in general
approximately 0.5-2.5 x 10^4 bases in length, each round of replication gives rise to
variant genomes, and a virus population consists of many genomes, most of which are
genetically distinct (104). Thus, any genetic description of an RNA virus stock or
isolate is necessarily that of a population. Such populations of viral genomes were
termed "viral quasispecies" by Eigen et al., who described the behavior expected of
such populations (105). Nowhere is the complexity of RNA virus quasispecies more
evident than for HIV (26,106-114).

The challenge this diversity poses for biologic control of virus replication is
daunting. From what we know of the copy numbers in an infected individual (30), each
individual harbors a very large number of proviruses, most of which are genetically
distinct. The consequences for therapy and prevention are all too clear.

Following primary infection, quasispecies tend to be homogeneous (26). Within a
relatively short time, however, considerable variation appears, certainly greater
than the genetic variation seen during an acute-phase influenza A infection (115) or
even during the prolonged infection of a child with severe combined
immunodeficiency. From work on the variation of SIV sequences with time, it is clear
that genetic complexity steadily increases after infection with virus of a single
clone (117,118). In this respect the difference between influenza A and HIV is that
the flu presents an acute infection that is quickly resolved, as the individual has
no natural reservoir for the virus. As mentioned, acute-phase HIV infection is no
different from acute-phase flu at the genetic level. However, as the HIV genome
integrates into host cell DNA, it may persist in sites of poor immunologic
surveillance. Along with hiding itself away in resting lymphocytes, variant genomes
may be generated and accumulated, even though clearance of HIV virions may be
efficient because of antiviral immune responses.

An analysis of sequences derived from PBMCs showed that there were as many variant
sequences as clones sequenced. From these data it was estimated that any two
proviruses in vivo (at least for the samples in question) differed by at least 10 to
20 base substitutions (108). Even within a single organ, such as the spleen, the
evidence of spatial inhomogeneity of HIV sequences was clear, not only in proviral
copy number but also in sequence complexity (110). Furthermore, HIV variation is
dynamic: HIV quasispecies may change in peripheral blood in as little as three
months (106,107,111).
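
Statements such as "any two proviruses differed by at least 10 to 20 base substitutions" rest on pairwise comparison of aligned clone sequences. A minimal sketch of that comparison follows; the four short sequences are hypothetical toy data, not sequences from the studies cited, and the function names are illustrative.

```python
from itertools import combinations


def hamming(a, b):
    """Count substitutions between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_differences(seqs):
    """Return the substitution count for every pair of sequences."""
    return [hamming(a, b) for a, b in combinations(seqs, 2)]


# Hypothetical aligned clones (toy data for illustration only).
clones = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAA", "ACGTACGTAC"]
diffs = pairwise_differences(clones)
distinct = len(set(clones))
print("pairwise substitutions:", diffs)
print("distinct sequences:", distinct, "of", len(clones), "clones")
```

The ratio of distinct sequences to clones sequenced is the quantity behind the observation that nearly every clone can turn out to be a different variant.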

HIV quasispecies are sensitive to selection pressures (119). In vitro culture of HIV
from PBMCs invariably results in the isolation of a subset of sequences present in
the original population (107). Even continued passage of an isolate or passage to
different established CD4+ cell lines results in tremendous selection (120). The
emergence of drug-resistant forms within as little as six months of treatment is
another remarkable example of the effects of selection pressure (121,122).

Such extraordinary genetic diversity may not be necessary to AIDS pathogenesis, 
although some have suggested this to be the case (123). This diversity probably will 
not hamper our understanding of HIV gene function and replication, although cau- 
tion should always be exercised in the interpretation of biologic significance. It be- 
comes crucial as soon as one tries to intervene through virus culture, therapy, or 
prevention. 

Numerous studies have shown that neutralizing antibody is of low titer and
essentially isolate specific (82). Group-reactive neutralizing antibodies have been
described but they appear to be of even lower titer. Neutralizing antibody is directed
primarily to the V3 loop (Fig. 7) of the gp120 molecule. Although the tip of the loop
is highly conserved, the adjacent sequences are highly variable (124). The V3 loop 
is a conformational epitope and neutralizing antibody escape mutants may result 
from changes within and without the loop. Frequently a single substitution is suffi- 
cient. 

Long-term prospects for efficient combination chemotherapy may be brighter. If
the frequency of producing a drug-resistant variant is of the order of 10^-4, then the
probability of producing a variant against four or five independently acting drugs will
be of the order of 10^-16 to 10^-20. Yet the population of HIV proviruses in an infected
individual is probably fewer than 10''. Thus, by combining drugs, the probability of 
producing a resistant variant becomes so low as to be virtually impossible. 
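
The combination argument is a product of independent probabilities, which is easiest to follow in log10 units. The sketch below assumes a per-drug resistance frequency of 10^-4 (the value consistent with the four- and five-drug figures of 10^-16 and 10^-20) and uses a provirus population of 10^11 purely as a placeholder assumption for illustration; the function names are hypothetical.

```python
def log10_resistance_probability(n_drugs, log10_per_drug=-4):
    """log10 of the chance that one genome resists all n independently acting drugs."""
    return n_drugs * log10_per_drug


def log10_expected_resistant(n_drugs, log10_population, log10_per_drug=-4):
    """log10 of the expected number of fully resistant genomes in the population."""
    return log10_population + log10_resistance_probability(n_drugs, log10_per_drug)


LOG10_POPULATION = 11  # placeholder assumption, not a figure from the text
for n in (1, 4, 5):
    p = log10_resistance_probability(n)
    e = log10_expected_resistant(n, LOG10_POPULATION)
    print(f"{n} drug(s): probability 10^{p}, expected resistant genomes 10^{e}")
```

With a single drug the expected number of resistant genomes is far above one, whereas with four or five drugs it falls far below one; this is the sense in which a fully resistant variant becomes virtually impossible under these assumptions.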



CONCLUSION 

HIV, a redoubtable pathogen that will be with us for a long time, is a remarkably 
complex member of the lentiviral subfamily of retroviruses. In vivo, it infects T4 
lymphocytes and antigen-presenting cells, either through the CD4 molecule or by
Fc receptor-mediated antibody enhancement. Its replication cycle is efficiently 
adapted to that of the T4 lymphocyte: it remains transcriptionally silent in small 
resting cells. As soon as the T cell becomes activated, the powerful transactivation 
of the proviral genome leads to massive viral replication. 

Retroviral replication proceeds in the absence of proofreading mechanisms. Con- 
sequently, replication is accompanied by the production of variant genomes. In the 
case of a single HIV-infected individual, the sheer number of variants is on the order of
10'' to 10". This ensures an endless source of antibody escape and drug-resistant 
variants. Prevention and therapy strategies will have to confront this problem.